{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Part 7 - Loading Image Data (Exercises) - Modified - v1.ipynb",
      "version": "0.3.2",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true,
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/wx-chevalier/AIDL-Workbench/blob/master/pytorch/classification/dogs_and_cats.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Z81WK0LXBO6s",
        "colab_type": "text"
      },
      "source": [
        "\n",
        "# Loading Image Data\n",
        "\n",
        "\n",
        "## **NOTE:** \n",
        "\n",
        "This is slightly edited notebook to help you with downloaing and loading image datasets. I have left the exercise parts untouched and you have to fill them, ofcourse. If you have any issue with this notebook, feel free to contact me `avinash` on Slack :D\n",
        "\n",
        "So far we've been working with fairly artificial datasets that you wouldn't typically be using in real projects. Instead, you'll likely be dealing with full-sized images like you'd get from smart phone cameras. In this notebook, we'll look at how to load images and use them to train neural networks.\n",
        "\n",
        "We'll be using a [dataset of cat and dog photos](https://www.kaggle.com/c/dogs-vs-cats) available from Kaggle. Here are a couple example images:\n",
        "\n",
        "<img src='https://github.com/udacity/deep-learning-v2-pytorch/blob/master/intro-to-pytorch/assets/dog_cat.png?raw=1'>\n",
        "\n",
        "We'll use this dataset to train a neural network that can differentiate between cats and dogs. These days it doesn't seem like a big accomplishment, but five years ago it was a serious challenge for computer vision systems."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "1rAby9wMCPiy",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# google colab does not come with torch installed. And also, in course we are using torch 0.4. \n",
        "# so following snippet of code installs the relevant version\n",
        "\n",
        "from os.path import exists\n",
        "from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag\n",
        "platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())\n",
        "cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\\.\\([0-9]*\\)\\.\\([0-9]*\\)$/cu\\1\\2/'\n",
        "accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'\n",
        "\n",
        "!pip install -q http://download.pytorch.org/whl/{accelerator}/torch-0.4.1-{platform}-linux_x86_64.whl torchvision\n",
        "import torch"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "bg1OFdVAC9N9",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# we will verify that GPU is enabled for this notebook\n",
        "# following should print: CUDA is available!  Training on GPU ...\n",
        "# \n",
        "# if it prints otherwise, then you need to enable GPU: \n",
        "# from Menu > Runtime > Change Runtime Type > Hardware Accelerator > GPU\n",
        "\n",
        "import torch\n",
        "import numpy as np\n",
        "\n",
        "# check if CUDA is available\n",
        "train_on_gpu = torch.cuda.is_available()\n",
        "\n",
        "if not train_on_gpu:\n",
        "    print('CUDA is not available.  Training on CPU ...')\n",
        "else:\n",
        "    print('CUDA is available!  Training on GPU ...')"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "0BIESeZKDn67",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# we will download the cats and dogs dataset and unzip them\n",
        "# after this run, set your data_dir variable: data_dir = 'Cat_Dog_data'\n",
        "# so all the data is available. so path to your training data becomes `Cat_Dog_data/train`\n",
        "# path to test is `Cat_Dog_data/test`\n",
        "\n",
        "!wget -c https://s3.amazonaws.com/content.udacity-data.com/nd089/Cat_Dog_data.zip\n",
        "\n",
        "# remove existing directories\n",
        "!rm -r Cat_Dog_data __MACOSX || true\n",
        "!unzip -qq Cat_Dog_data.zip"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "CEArFmLdCD6b",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# This is the contents of helper.py which I have added here so that import is not needed\n",
        "\n",
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "from torch import nn, optim\n",
        "from torch.autograd import Variable\n",
        "\n",
        "\n",
        "def test_network(net, trainloader):\n",
        "\n",
        "    criterion = nn.MSELoss()\n",
        "    optimizer = optim.Adam(net.parameters(), lr=0.001)\n",
        "\n",
        "    dataiter = iter(trainloader)\n",
        "    images, labels = dataiter.next()\n",
        "\n",
        "    # Create Variables for the inputs and targets\n",
        "    inputs = Variable(images)\n",
        "    targets = Variable(images)\n",
        "\n",
        "    # Clear the gradients from all Variables\n",
        "    optimizer.zero_grad()\n",
        "\n",
        "    # Forward pass, then backward pass, then update weights\n",
        "    output = net.forward(inputs)\n",
        "    loss = criterion(output, targets)\n",
        "    loss.backward()\n",
        "    optimizer.step()\n",
        "\n",
        "    return True\n",
        "\n",
        "\n",
        "def imshow(image, ax=None, title=None, normalize=True):\n",
        "    \"\"\"Imshow for Tensor.\"\"\"\n",
        "    if ax is None:\n",
        "        fig, ax = plt.subplots()\n",
        "    image = image.numpy().transpose((1, 2, 0))\n",
        "\n",
        "    if normalize:\n",
        "        mean = np.array([0.485, 0.456, 0.406])\n",
        "        std = np.array([0.229, 0.224, 0.225])\n",
        "        image = std * image + mean\n",
        "        image = np.clip(image, 0, 1)\n",
        "\n",
        "    ax.imshow(image)\n",
        "    ax.spines['top'].set_visible(False)\n",
        "    ax.spines['right'].set_visible(False)\n",
        "    ax.spines['left'].set_visible(False)\n",
        "    ax.spines['bottom'].set_visible(False)\n",
        "    ax.tick_params(axis='both', length=0)\n",
        "    ax.set_xticklabels('')\n",
        "    ax.set_yticklabels('')\n",
        "\n",
        "    return ax\n",
        "\n",
        "\n",
        "def view_recon(img, recon):\n",
        "    ''' Function for displaying an image (as a PyTorch Tensor) and its\n",
        "        reconstruction also a PyTorch Tensor\n",
        "    '''\n",
        "\n",
        "    fig, axes = plt.subplots(ncols=2, sharex=True, sharey=True)\n",
        "    axes[0].imshow(img.numpy().squeeze())\n",
        "    axes[1].imshow(recon.data.numpy().squeeze())\n",
        "    for ax in axes:\n",
        "        ax.axis('off')\n",
        "        ax.set_adjustable('box-forced')\n",
        "\n",
        "def view_classify(img, ps, version=\"MNIST\"):\n",
        "    ''' Function for viewing an image and it's predicted classes.\n",
        "    '''\n",
        "    ps = ps.data.numpy().squeeze()\n",
        "\n",
        "    fig, (ax1, ax2) = plt.subplots(figsize=(6,9), ncols=2)\n",
        "    ax1.imshow(img.resize_(1, 28, 28).numpy().squeeze())\n",
        "    ax1.axis('off')\n",
        "    ax2.barh(np.arange(10), ps)\n",
        "    ax2.set_aspect(0.1)\n",
        "    ax2.set_yticks(np.arange(10))\n",
        "    if version == \"MNIST\":\n",
        "        ax2.set_yticklabels(np.arange(10))\n",
        "    elif version == \"Fashion\":\n",
        "        ax2.set_yticklabels(['T-shirt/top',\n",
        "                            'Trouser',\n",
        "                            'Pullover',\n",
        "                            'Dress',\n",
        "                            'Coat',\n",
        "                            'Sandal',\n",
        "                            'Shirt',\n",
        "                            'Sneaker',\n",
        "                            'Bag',\n",
        "                            'Ankle Boot'], size='small');\n",
        "    ax2.set_title('Class Probability')\n",
        "    ax2.set_xlim(0, 1.1)\n",
        "\n",
        "    plt.tight_layout()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "H6fFWYhfBO6v",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "%matplotlib inline\n",
        "%config InlineBackend.figure_format = 'retina'\n",
        "\n",
        "import matplotlib.pyplot as plt\n",
        "\n",
        "import torch\n",
        "from torchvision import datasets, transforms"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AKcM7GaMBO6z",
        "colab_type": "text"
      },
      "source": [
        "The easiest way to load image data is with `datasets.ImageFolder` from `torchvision` ([documentation](http://pytorch.org/docs/master/torchvision/datasets.html#imagefolder)). In general you'll use `ImageFolder` like so:\n",
        "\n",
        "```python\n",
        "dataset = datasets.ImageFolder('path/to/data', transform=transform)\n",
        "```\n",
        "\n",
        "where `'path/to/data'` is the file path to the data directory and `transform` is a list of processing steps built with the [`transforms`](http://pytorch.org/docs/master/torchvision/transforms.html) module from `torchvision`. ImageFolder expects the files and directories to be constructed like so:\n",
        "```\n",
        "root/dog/xxx.png\n",
        "root/dog/xxy.png\n",
        "root/dog/xxz.png\n",
        "\n",
        "root/cat/123.png\n",
        "root/cat/nsdf3.png\n",
        "root/cat/asd932_.png\n",
        "```\n",
        "\n",
        "where each class has it's own directory (`cat` and `dog`) for the images. The images are then labeled with the class taken from the directory name. So here, the image `123.png` would be loaded with the class label `cat`. You can download the dataset already structured like this [from here](https://s3.amazonaws.com/content.udacity-data.com/nd089/Cat_Dog_data.zip). I've also split it into a training set and test set.\n",
        "\n",
        "### Transforms\n",
        "\n",
        "When you load in the data with `ImageFolder`, you'll need to define some transforms. For example, the images are different sizes but we'll need them to all be the same size for training. You can either resize them with `transforms.Resize()` or crop with `transforms.CenterCrop()`, `transforms.RandomResizedCrop()`, etc. We'll also need to convert the images to PyTorch tensors with `transforms.ToTensor()`. Typically you'll combine these transforms into a pipeline with `transforms.Compose()`, which accepts a list of transforms and runs them in sequence. It looks something like this to scale, then crop, then convert to a tensor:\n",
        "\n",
        "```python\n",
        "transform = transforms.Compose([transforms.Resize(255),\n",
        "                                 transforms.CenterCrop(224),\n",
        "                                 transforms.ToTensor()])\n",
        "\n",
        "```\n",
        "\n",
        "There are plenty of transforms available, I'll cover more in a bit and you can read through the [documentation](http://pytorch.org/docs/master/torchvision/transforms.html). \n",
        "\n",
        "### Data Loaders\n",
        "\n",
        "With the `ImageFolder` loaded, you have to pass it to a [`DataLoader`](http://pytorch.org/docs/master/data.html#torch.utils.data.DataLoader). The `DataLoader` takes a dataset (such as you would get from `ImageFolder`) and returns batches of images and the corresponding labels. You can set various parameters like the batch size and if the data is shuffled after each epoch.\n",
        "\n",
        "```python\n",
        "dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)\n",
        "```\n",
        "\n",
        "Here `dataloader` is a [generator](https://jeffknupp.com/blog/2013/04/07/improve-your-python-yield-and-generators-explained/). To get data out of it, you need to loop through it or convert it to an iterator and call `next()`.\n",
        "\n",
        "```python\n",
        "# Looping through it, get a batch on each loop \n",
        "for images, labels in dataloader:\n",
        "    pass\n",
        "\n",
        "# Get one batch\n",
        "images, labels = next(iter(dataloader))\n",
        "```\n",
        " \n",
        ">**Exercise:** Load images from the `Cat_Dog_data/train` folder, define a few transforms, then build the dataloader."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "N9BS-yYFBO60",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "data_dir = 'Cat_Dog_data/train'\n",
        "\n",
        "transform = # TODO: compose transforms here\n",
        "dataset = # TODO: create the ImageFolder\n",
        "dataloader = # TODO: use the ImageFolder dataset to create the DataLoader"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ulFq6oNZBO63",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# Run this to test your data loader\n",
        "images, labels = next(iter(dataloader))\n",
        "imshow(images[0], normalize=False)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "l4ZGB8vUBO65",
        "colab_type": "text"
      },
      "source": [
        "If you loaded the data correctly, you should see something like this (your image will be different):\n",
        "\n",
        "<img src='https://github.com/udacity/deep-learning-v2-pytorch/blob/master/intro-to-pytorch/assets/cat_cropped.png?raw=1' width=244>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8UK5pmGfBO67",
        "colab_type": "text"
      },
      "source": [
        "## Data Augmentation\n",
        "\n",
        "A common strategy for training neural networks is to introduce randomness in the input data itself. For example, you can randomly rotate, mirror, scale, and/or crop your images during training. This will help your network generalize as it's seeing the same images but in different locations, with different sizes, in different orientations, etc.\n",
        "\n",
        "To randomly rotate, scale and crop, then flip your images you would define your transforms like this:\n",
        "\n",
        "```python\n",
        "train_transforms = transforms.Compose([transforms.RandomRotation(30),\n",
        "                                       transforms.RandomResizedCrop(224),\n",
        "                                       transforms.RandomHorizontalFlip(),\n",
        "                                       transforms.ToTensor(),\n",
        "                                       transforms.Normalize([0.5, 0.5, 0.5], \n",
        "                                                            [0.5, 0.5, 0.5])])\n",
        "```\n",
        "\n",
        "You'll also typically want to normalize images with `transforms.Normalize`. You pass in a list of means and list of standard deviations, then the color channels are normalized like so\n",
        "\n",
        "```input[channel] = (input[channel] - mean[channel]) / std[channel]```\n",
        "\n",
        "Subtracting `mean` centers the data around zero and dividing by `std` squishes the values to be between -1 and 1. Normalizing helps keep the network work weights near zero which in turn makes backpropagation more stable. Without normalization, networks will tend to fail to learn.\n",
        "\n",
        "You can find a list of all [the available transforms here](http://pytorch.org/docs/0.3.0/torchvision/transforms.html). When you're testing however, you'll want to use images that aren't altered (except you'll need to normalize the same way). So, for validation/test images, you'll typically just resize and crop.\n",
        "\n",
        ">**Exercise:** Define transforms for training data and testing data below."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "qMlykOoHBO67",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "data_dir = 'Cat_Dog_data'\n",
        "\n",
        "# TODO: Define transforms for the training data and testing data\n",
        "train_transforms = \n",
        "\n",
        "test_transforms = \n",
        "\n",
        "\n",
        "# Pass transforms in here, then run the next cell to see how the transforms look\n",
        "train_data = datasets.ImageFolder(data_dir + '/train', transform=train_transforms)\n",
        "test_data = datasets.ImageFolder(data_dir + '/test', transform=test_transforms)\n",
        "\n",
        "trainloader = torch.utils.data.DataLoader(train_data, batch_size=32)\n",
        "testloader = torch.utils.data.DataLoader(test_data, batch_size=32)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "YuN17PwfBO6-",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# change this to the trainloader or testloader \n",
        "data_iter = iter(testloader)\n",
        "\n",
        "images, labels = next(data_iter)\n",
        "fig, axes = plt.subplots(figsize=(10,4), ncols=4)\n",
        "for ii in range(4):\n",
        "    ax = axes[ii]\n",
        "    imshow(images[ii], ax=ax, normalize=False)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NRiC7KBxBO7B",
        "colab_type": "text"
      },
      "source": [
        "Your transformed images should look something like this.\n",
        "\n",
        "<center>Training examples:</center>\n",
        "<img src='https://github.com/udacity/deep-learning-v2-pytorch/blob/master/intro-to-pytorch/assets/train_examples.png?raw=1' width=500px>\n",
        "\n",
        "<center>Testing examples:</center>\n",
        "<img src='https://github.com/udacity/deep-learning-v2-pytorch/blob/master/intro-to-pytorch/assets/test_examples.png?raw=1' width=500px>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xiXuhzXnBO7D",
        "colab_type": "text"
      },
      "source": [
        "At this point you should be able to load data for training and testing. Now, you should try building a network that can classify cats vs dogs. This is quite a bit more complicated than before with the MNIST and Fashion-MNIST datasets. To be honest, you probably won't get it to work with a fully-connected network, no matter how deep. These images have three color channels and at a higher resolution (so far you've seen 28x28 images which are tiny).\n",
        "\n",
        "In the next part, I'll show you how to use a pre-trained network to build a model that can actually solve this problem."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "jqJuK625BO7D",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# Optional TODO: Attempt to build a network to classify cats vs dogs from this dataset"
      ],
      "execution_count": 0,
      "outputs": []
    }
  ]
}